10. Training the Network
Cross-Entropy Loss
In the PyTorch documentation, you can see that the cross-entropy loss function actually involves two steps:
- It first applies a softmax function to any output it sees
- Then it applies NLLLoss, the negative log likelihood loss
Then it returns the average loss over a batch of data. Since it applies a softmax function, we do **not** have to specify that in the `forward` function of our model definition, but we could do this another way.
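To check this equivalence yourself, you can compare the two computations on the same batch of scores. This is a minimal sketch; the tensor shapes and values are made up for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# dummy batch: 4 samples with 10 raw class scores each, plus integer labels
scores = torch.randn(4, 10)
targets = torch.tensor([1, 0, 7, 3])

# one step: cross-entropy loss applied directly to the raw scores
loss_ce = nn.CrossEntropyLoss()(scores, targets)

# two steps: log-softmax first, then negative log likelihood
loss_nll = nn.NLLLoss()(F.log_softmax(scores, dim=1), targets)

print(torch.isclose(loss_ce, loss_nll))  # tensor(True)
```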
Another approach
We could separate the softmax and NLLLoss steps.
- In the `forward` function of our model, we would explicitly apply a log-softmax activation function to the output, `x`:
```python
...
...
# a log-softmax layer to convert the 10 outputs into log probabilities over classes
x = F.log_softmax(x, dim=1)
return x
```
- Then, when defining our loss criterion, we would apply `NLLLoss`:
```python
# cross-entropy loss combines softmax and nn.NLLLoss() in one single class
# here, we've separated them
criterion = nn.NLLLoss()
```
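Putting the two pieces together, a model using this approach might look like the following sketch; the layer sizes and names (`fc1`, `fc2`) are hypothetical placeholders, not the lesson's exact architecture:

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # hypothetical layers: flatten a 28x28 input down to 10 class scores
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)   # flatten the input batch
        x = F.relu(self.fc1(x))
        # log-softmax in the model, so NLLLoss is the matching criterion
        x = F.log_softmax(self.fc2(x), dim=1)
        return x

model = Net()
criterion = nn.NLLLoss()  # pairs with the log-softmax output above
```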
This separates the usual `criterion = nn.CrossEntropyLoss()` into two steps: softmax and NLLLoss. It is a useful approach should you want the output of a model to be class *probabilities* rather than class scores.
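Since `F.log_softmax` returns *log* probabilities, one common way to recover ordinary probabilities at inference time is to exponentiate the output. A quick sketch on random scores standing in for a model's output:

```python
import torch
import torch.nn.functional as F

scores = torch.randn(4, 10)               # raw class scores from some model
log_probs = F.log_softmax(scores, dim=1)  # what our forward function returns
probs = torch.exp(log_probs)              # convert back to probabilities
print(probs.sum(dim=1))                   # each row sums to ~1.0
```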